Production Job Scheduling for Parallel Shared Memory Systems
Abstract
This paper addresses open job scheduling questions for the challenge workloads that run on the large-scale parallel systems at supercomputer centers. Simulation results for six recent one-month job traces from the NCSA Origin 2000 (O2K) system are used to evaluate (1) the experimentally tuned NCSA LSF* policy, (2) the FCFS-backfill policy, (3) the Priority-backfill policy with alternative priority functions and with limited preemption to provide immediate service to each arriving job, and (4) the spatial equipartitioning (EQspatial) policy with an optional modification to reduce the maximum waiting time for the largest jobs in the challenge workloads. Measurements on the O2K validate the simulation results for two of the policies. The Priority-backfill policy with immediate service and a starvation-free priority measure that favors short jobs is shown to be the most promising if jobs cannot adapt to changing processor allocations at runtime, but EQspatial provides significantly better 95th percentile waiting time.
Similar resources
Job Management Requirements for NAS Parallel Systems and Clusters
A job management system is a critical component of a production supercomputing environment, permitting oversubscribed resources to be shared fairly and efficiently. Job management systems that were originally designed for traditional vector supercomputers are not appropriate for the distributed-memory parallel supercomputers that are becoming increasingly important in the high performance compu...
The Impact of Program Structure on the Performance of Scheduling Policies in Multiprocessor Systems
A simple fork and join type of job structure has been extensively used for performance evaluation of processor scheduling policies in multiprocessor systems. However, parallel programs often exhibit a more complicated structure. It is not clear how the program structure affects the performance of processor scheduling policies. This paper studies the impact of the program structure on the perfor...
Lecture Notes in Computer Science 7204
The goal of this paper is to propose a methodology for effective cost function determination for the job shop scheduling problem in a parallel computing environment. The Parallel Random Access Machine (PRAM) model is applied for the theoretical analysis of algorithm efficiency. The methods need fine-grained parallelization; therefore, the proposed approach is especially devoted to parallel comput...
Integrated scheduling: the best of both worlds
This paper presents a new paradigm for parallel job scheduling called integrated scheduling or iScheduling. The iScheduler is an application-aware job scheduler as opposed to a general-purpose system scheduler. It dynamically controls resource allocation among a set of competing applications, but unlike a traditional job scheduler, it can interact directly with an application during execution t...
On single-walk parallelization of the job shop problem solving algorithms
New parallel objective function determination methods for the job shop scheduling problem are proposed in this paper, considering the makespan and the sum of job execution times criteria; however, the proposed methods can also be applied to other popular objective functions such as job tardiness or flow time. The Parallel Random Access Machine (PRAM) model is applied for the theoretical analysis of...